Enhanced Back-Translation for Low Resource Neural Machine Translation Using Self-training
Authors
Abstract
Improving neural machine translation (NMT) models using the back-translations of monolingual target data (synthetic parallel data) is currently the state-of-the-art approach for training improved translation systems. The quality of the backward system, which is trained on the available parallel data and used for the back-translation, has been shown in many studies to affect the performance of the final NMT model. In low resource conditions, the available parallel data is usually not enough to train a backward model that can produce the qualitative synthetic data needed to train a standard translation model. This work proposes a self-training strategy where the output of the backward model is used to improve the model itself through the forward translation technique. The technique improved the baseline IWSLT'14 English-German and IWSLT'15 English-Vietnamese backward translation models by 11.06 and 1.5 BLEU respectively. The synthetic data generated by the improved backward model out-performed that of the standard back-translation approach by 2.7 BLEU.
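In outline, the proposed pipeline first trains the backward model on the small authentic corpus, improves it on its own translations of monolingual target data (self-training via forward translation), and only then uses it for back-translation. The sketch below illustrates that flow; the helpers train_nmt and translate and the data variables are hypothetical placeholders, not the authors' implementation.

```python
# Sketch of back-translation enhanced with self-training of the backward
# model. train_nmt and translate are hypothetical placeholders: train_nmt
# takes (input, output) sentence pairs in the direction being trained.

def train_nmt(pairs):
    """Train an NMT model on (input, output) sentence pairs (placeholder)."""
    raise NotImplementedError

def translate(model, sentences):
    """Translate a list of sentences with `model` (placeholder)."""
    raise NotImplementedError

def enhanced_back_translation(parallel, mono_target):
    # parallel: authentic (source, target) pairs;
    # mono_target: monolingual target-language sentences.

    # 1. Backward model (target -> source) trained on the authentic data.
    backward_pairs = [(tgt, src) for (src, tgt) in parallel]
    backward = train_nmt(backward_pairs)

    # 2. Self-training: the backward model translates the monolingual target
    #    data and is retrained on its own (authentic target, synthetic source)
    #    output -- the forward-translation technique applied to the backward model.
    synthetic_src = translate(backward, mono_target)
    backward = train_nmt(backward_pairs + list(zip(mono_target, synthetic_src)))

    # 3. Standard back-translation, now using the improved backward model.
    better_src = translate(backward, mono_target)
    forward_pairs = parallel + list(zip(better_src, mono_target))

    # 4. Final forward (source -> target) model trained on authentic plus
    #    synthetic parallel data.
    return train_nmt(forward_pairs)
```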
Similar resources
Neural machine translation for low-resource languages
Neural machine translation (NMT) approaches have improved the state of the art in many machine translation settings over the last couple of years, but they require large amounts of training data to produce sensible output. We demonstrate that NMT can be used for low-resource languages as well, by introducing more local dependencies and using word alignments to learn sentence reordering during t...
Multilingual Neural Machine Translation for Low Resource Languages
Neural Machine Translation (NMT) has been shown to be more effective in translation tasks compared to the Phrase-Based Statistical Machine Translation (PBMT). However, NMT systems are limited in translating low-resource languages (LRL), due to the fact that neural methods require a large amount of parallel data to learn effective mappings between languages. In this work we show how so-called mu...
Data Augmentation for Low-Resource Neural Machine Translation
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, syn...
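A rough illustration of that augmentation idea, not the paper's exact procedure: substitute a rare word into the target side of an existing pair and its translation into the aligned source position. The language-model check lm_plausible and the bilingual lexicon are hypothetical helpers.

```python
import random

def augment_with_rare_word(src_sent, tgt_sent, alignment, rare_tgt_words,
                           lm_plausible, lexicon):
    """Create a new sentence pair containing a rare target-language word.

    alignment: list of (src_idx, tgt_idx) word alignments (assumed given).
    rare_tgt_words: low-frequency target words to promote.
    lm_plausible(context, word): hypothetical LM check for fluency in context.
    lexicon(tgt_word): hypothetical dictionary giving a source translation.
    """
    src_tokens, tgt_tokens = src_sent.split(), tgt_sent.split()
    for src_idx, tgt_idx in alignment:
        candidate = random.choice(rare_tgt_words)
        if lm_plausible(tgt_tokens[:tgt_idx], candidate):
            new_src, new_tgt = src_tokens.copy(), tgt_tokens.copy()
            new_tgt[tgt_idx] = candidate           # rare word on the target side
            new_src[src_idx] = lexicon(candidate)  # its translation on the source side
            return " ".join(new_src), " ".join(new_tgt)
    return None  # no fluent substitution found for this pair
```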
Universal Neural Machine Translation for Extremely Low Resource Languages
In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. Our proposed approach utilizes a transfer-learning approach to share lexical and sentence-level representations across multiple source languages into one target language. The lexical part is shared through a Universal Lexical Representation to support multilingual...
Transfer Learning for Low-Resource Neural Machine Translation
The encoder-decoder framework for neural machine translation (NMT) has been shown effective in large data scenarios, but is much less effective for low-resource languages. We present a transfer learning method that significantly improves BLEU scores across a range of low-resource languages. Our key idea is to first train a high-resource language pair (the parent model), then transfer some of th...
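The recipe reduces to parameter transfer followed by fine-tuning on the low-resource pair. A minimal sketch under PyTorch-style assumptions; build_model, train, and fine_tune are hypothetical placeholders rather than the paper's code.

```python
def transfer_learning_nmt(high_resource_pairs, low_resource_pairs,
                          build_model, train, fine_tune):
    """Sketch of parent/child transfer for NMT (hypothetical helpers).

    build_model(): constructs an encoder-decoder NMT model.
    train(model, pairs): trains the model from scratch on (src, tgt) pairs.
    fine_tune(model, pairs): continues training on new data.
    """
    # 1. Train the parent model on the high-resource language pair.
    parent = build_model()
    train(parent, high_resource_pairs)

    # 2. Initialise the child model with the parent's parameters
    #    (in practice some embeddings may be reset or frozen).
    child = build_model()
    child.load_state_dict(parent.state_dict())  # assumes PyTorch-style modules

    # 3. Fine-tune the child model on the low-resource pair.
    fine_tune(child, low_resource_pairs)
    return child
```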
Journal
Journal title: Communications in Computer and Information Science
Year: 2021
ISSN: 1865-0937, 1865-0929
DOI: https://doi.org/10.1007/978-3-030-69143-1_28